Family relation extraction from Wikipedia by self-supervised learning

ZHU Suyang, HUI Haotian, QIAN Longhua, ZHANG Min

Journal of Computer Applications 2015, 35 (4): 1013-1016. DOI: 10.11772/j.issn.1001-9081.2015.04.1013

Traditional supervised relation extraction requires a large amount of manually annotated training data, while semi-supervised learning suffers from low recall. A self-supervised learning approach was proposed to extract personal family relationships. First, semi-structured information (family relation triples) was mapped to the free text of Chinese Wikipedia to automatically generate annotated training data. Then, family relations between person entities were extracted from Wikipedia text with a feature-based relation extraction method. Experimental results on a manually annotated test family network show that this method outperforms Bootstrapping, with an F1-measure of 77%, implying that self-supervised learning can effectively extract personal family relationships.
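
The automatic annotation step described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `label_sentences`, the simple substring matching, and the toy English data are all hypothetical, and the paper itself works on Chinese Wikipedia text.

```python
from typing import List, Tuple

# (person_a, relation, person_b), e.g. taken from semi-structured data
Triple = Tuple[str, str, str]
# (sentence, person_a, person_b, relation)
Example = Tuple[str, str, str, str]


def label_sentences(sentences: List[str], triples: List[Triple]) -> List[Example]:
    """Label each sentence that mentions both persons of a known triple.

    Assumption: plain substring matching stands in for real entity
    recognition and linking.
    """
    examples = []
    for sentence in sentences:
        for person_a, relation, person_b in triples:
            if person_a in sentence and person_b in sentence:
                examples.append((sentence, person_a, person_b, relation))
    return examples


# Hypothetical toy data, not taken from the paper.
triples = [("Alice Wong", "spouse", "Bob Wong")]
sentences = [
    "Alice Wong married Bob Wong in 1990.",
    "Alice Wong was born in Suzhou.",
]
examples = label_sentences(sentences, triples)
```

Only the first sentence mentions both persons, so it alone becomes a training example labeled `spouse`. A feature-based classifier could then be trained on such examples.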

Chinese cross document co-reference resolution based on SVM classification and semantics

ZHAO Zhiwei, GU Jinghang, HU Yanan, QIAN Longhua, ZHOU Guodong

Journal of Computer Applications 2013, 33 (04): 984-987. DOI: 10.3724/SP.J.1087.2013.00984

Cross-Document Co-reference Resolution (CDCR) aims to merge words that are distributed across different texts but refer to the same entity into co-reference chains. Traditional research on CDCR uses clustering methods to address the name disambiguation problem posed in information retrieval. This paper recast CDCR as a classification problem, using a Support Vector Machine (SVM) classifier to resolve both name disambiguation and variant consolidation, both of which are prevalent in information extraction. This method can effectively integrate various features, such as morphological, phonetic, and semantic knowledge collected from the corpus and the Internet. Experiments on a Chinese cross-document co-reference corpus show that the classification method outperforms clustering methods in both precision and recall.
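
One way to picture CDCR as classification is to make a same-entity decision for each pair of mentions and then merge the positive pairs into chains. The sketch below assumes this pairwise formulation and a transitive merge, neither of which the abstract confirms. The hand-written rule in `same_entity` is a hypothetical stand-in for a trained SVM decision, and the toy mentions are invented.

```python
from itertools import combinations


def same_entity(a, b):
    """Hypothetical stand-in for a trained SVM's yes/no decision on a pair."""
    same_family_token = a["name"].split()[-1] == b["name"].split()[-1]
    return same_family_token and a["topic"] == b["topic"]


def build_chains(mentions):
    """Merge mentions linked by positive pairwise decisions (union-find)."""
    parent = list(range(len(mentions)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(mentions)), 2):
        if same_entity(mentions[i], mentions[j]):
            parent[find(j)] = find(i)

    chains = {}
    for i, mention in enumerate(mentions):
        chains.setdefault(find(i), []).append(mention["doc"])
    return sorted(chains.values())


# Invented toy mentions: a name variant (d2) and a namesake (d3).
mentions = [
    {"doc": "d1", "name": "Li Ming", "topic": "sports"},
    {"doc": "d2", "name": "L. Ming", "topic": "sports"},
    {"doc": "d3", "name": "Li Ming", "topic": "finance"},
]
chains = build_chains(mentions)
```

Here the variant in `d2` is consolidated with `d1`, while the namesake in `d3` stays in its own chain, which mirrors the two problems the abstract names: variant consolidation and name disambiguation.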

Comparative analysis of impact of lexical semantic information on Chinese entity relation extraction

LIU Dan-dan, PENG Cheng, QIAN Long-hua, ZHOU Guo-dong

Journal of Computer Applications 2012, 32 (08): 2238-2244. DOI: 10.3724/SP.J.1087.2012.02238

A method was proposed to incorporate semantic information from TongYiCi CiLin and HowNet into tree kernel-based Chinese relation extraction. The impact of these two kinds of semantic information on Chinese entity relation extraction was compared and analyzed, and the interrelation between lexical semantic information and entity type information was explored. The experimental results show that this method improves the performance of Chinese relation extraction to some degree, and that TongYiCi CiLin complements the entity type information to a certain extent. Therefore, whether or not entity type information is used, the semantic information from TongYiCi CiLin significantly improves extraction performance for most relation types. In contrast, HowNet conflicts in part with the entity type information, so it improves performance for only a few relation types when entity types are provided.
